ROC curves and nonrandom data
Author: Jonathan Aaron
Abstract
This paper shows that when a classifier is evaluated with nonrandom test data, ROC curves differ from the ROC curves that would be obtained with a random sample. To address this bias, this paper introduces a procedure for plotting ROC curves that are inferred from nonrandom test data. I provide simulations and an example with wine data to illustrate the procedure as well as the magnitude of bias that is found in standard ROC curves generated from nonrandom test data.
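The gap described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's procedure: the Gaussian score model, the selection rule (only cases with a score above 0.5 enter the test set), and the helper name `auc` are all hypothetical choices made for illustration. It compares the area under the ROC curve on a random test sample with the area on a nonrandomly selected one.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Assumes continuous scores with no ties.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)  # rank 1 = lowest score
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
n = 200_000

# Assumed population: balanced classes, scores ~ N(1, 1) for positives
# and N(0, 1) for negatives.
labels = rng.random(n) < 0.5
scores = rng.normal(loc=labels.astype(float), scale=1.0)

# Random test sample: every case is equally likely to be included.
random_idx = rng.choice(n, size=20_000, replace=False)

# Nonrandom test sample (assumed rule): only cases scoring above 0.5.
selected = scores > 0.5

auc_random = auc(scores[random_idx], labels[random_idx])
auc_selected = auc(scores[selected], labels[selected])

print(f"AUC, random sample:    {auc_random:.3f}")
print(f"AUC, nonrandom sample: {auc_selected:.3f}")
```

Under these assumptions the nonrandom sample yields a visibly lower area than the random sample, because selecting on the score compresses the range over which the two classes can be separated. Other selection rules can shift the curve in other ways.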
Similar resources
Making classifier performance comparisons when ROC curves intersect
The ROC curve is one of the most common statistical tools for assessing classifier performance. Selecting the best classifier when ROC curves intersect is challenging. A novel approach for model comparisons when ROC curves intersect is proposed. In particular, the relationship between ROC orderings and stochastic dominance is investigated in a theoretical framework and ...
PRROC: computing and visualizing precision-recall and receiver operating characteristic curves in R
Precision-recall (PR) and receiver operating characteristic (ROC) curves are valuable measures of classifier performance. Here, we present the R-package PRROC, which allows for computing and visualizing both PR and ROC curves. In contrast to available R-packages, PRROC allows for computing PR and ROC curves and areas under these curves for soft-labeled data using a continuous interpolation betw...
Upper and Lower Bounds of Area Under ROC Curves and Index of Discriminability of Classifier Performance
The area under an ROC curve plays an important role in estimating discrimination performance: a well-known theorem by Green (1964) states that the ROC area equals the percentage correct in a two-alternative forced-choice setting. When only a single data point is available, upper and lower bounds on discrimination performance can be constructed based on the maximum and minimum area of legitimate ROC c...
Comparison of correlated receiver operating characteristic curves derived from repeated diagnostic test data.
RATIONALE AND OBJECTIVES: It is common to administer the same diagnostic test more than once to the same set of patients. The purpose of this study was to develop two statistical methods for estimating and comparing correlated receiver operating characteristic (ROC) curves for data derived from repeated diagnostic tests. MATERIALS AND METHODS: Parametric and semiparametric transformation models w...
Evaluating the ROC performance of markers for future events.
Receiver operating characteristic (ROC) curves play a central role in the evaluation of biomarkers and tests for disease diagnosis. Predictors for event time outcomes can also be evaluated with ROC curves, but the time lag between marker measurement and event time must be acknowledged. We discuss different definitions of time-dependent ROC curves in the context of real applications. Several app...